mc sample
Partially Observable Gaussian Process Network and Doubly Stochastic Variational Inference
Kiroriwal, Saksham, Pfrommer, Julius, Beyerer, Jürgen
To reduce the curse of dimensionality in Gaussian processes (GPs), a high-dimensional process can be decomposed into a Gaussian Process Network (GPN) of coupled, lower-dimensional subprocesses. Intermediate observations are sometimes available within the GPN, but in most real-world systems they are indirect, noisy, and incomplete. This work introduces the Partially Observable Gaussian Process Network (POGPN) to model such real-world process networks. We model the joint distribution over the latent functions of the subprocesses and perform inference using observations from all subprocesses. POGPN incorporates observation lenses (observation likelihoods) into the well-established inference framework of deep Gaussian processes. We also introduce two training methods for POGPN that use node observations to perform inference over the whole network. Applications to benchmark problems demonstrate how incorporating partial observations during training and inference can improve the predictive performance of the overall network, offering a promising outlook for practical application.
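A minimal sketch of the setting, assuming a toy two-node network and an illustrative RBF kernel (none of this is the authors' code): node 1's latent output feeds node 2, but node 1 is only seen through a noisy, incomplete observation lens.

```python
# Toy generative view of a partially observable GPN; names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(a, b, lengthscale=0.5):
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

# Node 1: latent subprocess f1 ~ GP(0, k) over inputs x.
x = np.linspace(0.0, 1.0, 50)
K1 = rbf_kernel(x, x) + 1e-6 * np.eye(len(x))
f1 = rng.multivariate_normal(np.zeros(len(x)), K1)

# Observation lens for node 1: an indirect, noisy, incomplete view of f1.
mask = rng.random(len(x)) < 0.3            # only ~30% of sites are observed
y1 = f1[mask] + 0.1 * rng.normal(size=mask.sum())

# Node 2: subprocess f2 takes node 1's (low-dimensional) output as input.
K2 = rbf_kernel(f1, f1, lengthscale=1.0) + 1e-6 * np.eye(len(f1))
f2 = rng.multivariate_normal(np.zeros(len(f1)), K2)
y2 = f2 + 0.05 * rng.normal(size=len(f2))  # node-2 observations

# POGPN-style inference would condition the joint posterior over (f1, f2)
# on both y1 (partial, through the lens) and y2, e.g. with doubly
# stochastic variational inference as in deep GPs.
```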
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.05)
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment
Kazemnejad, Amirhossein, Aghajohari, Milad, Portelance, Eva, Sordoni, Alessandro, Reddy, Siva, Courville, Aaron, Roux, Nicolas Le
Large language models (LLMs) are increasingly applied to complex reasoning tasks that require executing several steps before receiving any reward. Properly assigning credit to these steps is essential for enhancing model performance. Proximal Policy Optimization (PPO), a state-of-the-art reinforcement learning (RL) algorithm used for LLM finetuning, employs value networks to tackle credit assignment. However, value networks struggle to predict the expected cumulative reward accurately in complex reasoning tasks, often leading to high-variance updates and suboptimal performance. In this work, we systematically evaluate the efficacy of value networks and reveal their significant shortcomings in reasoning-heavy LLM tasks, showing that they barely outperform a random baseline when comparing alternative steps. To address this, we propose VinePPO, a straightforward approach that leverages the flexibility of language environments to compute unbiased Monte Carlo-based estimates, bypassing the need for large value networks. Our method consistently outperforms PPO and other RL-free baselines across the MATH and GSM8K datasets with fewer gradient updates (up to 9x) and less wall-clock time (up to 3.0x). These results emphasize the importance of accurate credit assignment in RL finetuning of LLMs and demonstrate VinePPO's potential as a superior alternative.
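As a rough illustration of the core idea, the sketch below estimates per-step values by averaging the returns of a few rollouts continued from each intermediate reasoning state; `policy_sample` and `reward` are toy stand-ins for an LLM and an answer verifier, not the released VinePPO code.

```python
import random

def policy_sample(prefix):
    # Toy stand-in for sampling a completion from the LLM policy.
    return random.choice([" hence the answer is 42", " hence the answer is 7"])

def reward(solution):
    # Toy stand-in for a final-answer verifier: 1 if correct, else 0.
    return 1.0 if solution.endswith("42") else 0.0

def mc_value(prefix, num_rollouts=8):
    # V(s_t) estimated as the mean return of K independent continuations
    # of the partial solution -- unbiased, with no value network to train.
    returns = [reward(prefix + policy_sample(prefix)) for _ in range(num_rollouts)]
    return sum(returns) / num_rollouts

def step_advantages(steps, num_rollouts=8):
    # With a terminal-only reward, A_t = V(s_{t+1}) - V(s_t); the value of
    # the completed solution is simply its reward.
    values = [mc_value("".join(steps[:t]), num_rollouts) for t in range(len(steps))]
    values.append(reward("".join(steps)))
    return [values[t + 1] - values[t] for t in range(len(steps))]

print(step_advantages(["Step 1: compute 6*7.", " Step 2: the answer is 42"]))
```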
- Europe > Austria > Vienna (0.14)
- North America > Canada > Quebec > Montreal (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (10 more...)
- Education (0.67)
- Leisure & Entertainment (0.46)
Uncertainty-Informed Volume Visualization using Implicit Neural Representation
Saklani, Shanu, Goel, Chitwan, Bansal, Shrey, Wang, Zhe, Dutta, Soumya, Athawale, Tushar M., Pugmire, David, Johnson, Christopher R.
The increasing adoption of Deep Neural Networks (DNNs) has led to their application in many challenging scientific visualization tasks. While advanced DNNs offer impressive generalization capabilities, understanding factors such as model prediction quality, robustness, and uncertainty is crucial. These insights can enable domain scientists to make informed decisions about their data. However, DNNs inherently lack the ability to estimate prediction uncertainty, necessitating new research on robust uncertainty-aware visualization techniques tailored to various visualization tasks. In this work, we propose uncertainty-aware implicit neural representations to model scalar field data sets effectively and comprehensively study the efficacy and benefits of estimated uncertainty information for volume visualization tasks. We evaluate two principled deep uncertainty estimation techniques, (1) Deep Ensemble and (2) Monte Carlo Dropout (MCDropout), which enable uncertainty-informed volume visualization of scalar field data sets. Our extensive exploration across multiple data sets demonstrates that uncertainty-aware models produce informative volume visualization results. Moreover, integrating prediction uncertainty enhances the trustworthiness of our DNN model, making it suitable for robustly analyzing and visualizing real-world scientific volumetric data sets.
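A minimal MCDropout sketch under toy assumptions (the architecture and names are illustrative, not the paper's model): dropout stays active at inference, and the spread over repeated stochastic forward passes serves as the per-sample uncertainty estimate.

```python
import torch
import torch.nn as nn

class DropoutINR(nn.Module):
    """Tiny implicit neural representation of a scalar field with dropout."""
    def __init__(self, in_dim=3, hidden=128, p=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, 1),
        )

    def forward(self, coords):
        return self.net(coords)

@torch.no_grad()
def mc_dropout_predict(model, coords, num_samples=32):
    model.train()  # keep dropout stochastic at inference time
    preds = torch.stack([model(coords) for _ in range(num_samples)])
    return preds.mean(dim=0), preds.std(dim=0)  # field estimate, uncertainty

# Usage: query (x, y, z) sample points of the volume.
model = DropoutINR()
coords = torch.rand(1024, 3)
mean, std = mc_dropout_predict(model, coords)
```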
- Asia > India > Uttar Pradesh > Kanpur (0.04)
- North America > United States > Utah (0.04)
- North America > United States > Colorado > Boulder County > Boulder (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Energy (0.46)
- Government > Regional Government > North America Government > United States Government (0.46)
When Monte-Carlo Dropout Meets Multi-Exit: Optimizing Bayesian Neural Networks on FPGA
Fan, Hongxiang, Chen, Hao, Castelli, Liam, Que, Zhiqiang, Li, He, Long, Kenneth, Luk, Wayne
Bayesian Neural Networks (BayesNNs) have demonstrated their capability to provide calibrated predictions for safety-critical applications such as medical imaging and autonomous driving. However, their high algorithmic complexity and poor hardware performance hinder deployment in real-life applications. To bridge this gap, this paper proposes a novel multi-exit Monte-Carlo Dropout (MCD)-based BayesNN that achieves well-calibrated predictions with low algorithmic complexity. To further reduce the barrier to adopting BayesNNs, we propose a transformation framework that can generate FPGA-based accelerators for multi-exit MCD-based BayesNNs. Several novel optimization techniques are introduced to improve hardware performance. Our experiments demonstrate that our auto-generated accelerator achieves higher energy efficiency than CPU, GPU, and other state-of-the-art hardware implementations.
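A software-level sketch of the multi-exit MCD idea (illustrative PyTorch, not the FPGA design): every exit of every stochastic forward pass contributes one Monte Carlo sample, so fewer full passes are needed for a calibrated average.

```python
import torch
import torch.nn as nn

class MultiExitMCD(nn.Module):
    def __init__(self, in_dim=32, hidden=64, classes=10, p=0.2):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p))
        self.block2 = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p))
        self.exit1 = nn.Linear(hidden, classes)   # early exit
        self.exit2 = nn.Linear(hidden, classes)   # final exit

    def forward(self, x):
        h1 = self.block1(x)
        h2 = self.block2(h1)
        return self.exit1(h1), self.exit2(h2)

@torch.no_grad()
def predict(model, x, num_passes=5):
    model.train()  # dropout stays on: each pass draws fresh dropout masks
    probs = []
    for _ in range(num_passes):
        for logits in model(x):               # reuse every exit as a sample
            probs.append(logits.softmax(dim=-1))
    return torch.stack(probs).mean(dim=0)     # ensemble over exits x passes

model = MultiExitMCD()
p = predict(model, torch.randn(8, 32))
```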
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > United Kingdom > England > Greater London > London (0.05)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Information Technology (0.88)
- Health & Medicine > Diagnostic Medicine > Imaging (0.34)
Alpha-divergence Variational Inference Meets Importance Weighted Auto-Encoders: Methodology and Asymptotics
Daudel, Kamélia, Benton, Joe, Shi, Yuyang, Doucet, Arnaud
Several algorithms involving the Variational Rényi (VR) bound have been proposed to minimize an alpha-divergence between a target posterior distribution and a variational distribution. Despite promising empirical results, these algorithms resort to biased stochastic gradient descent procedures and thus lack theoretical guarantees. In this paper, we formalize and study the VR-IWAE bound, a generalization of the Importance Weighted Auto-Encoder (IWAE) bound. We show that the VR-IWAE bound enjoys several desirable properties and notably leads to the same stochastic gradient descent procedure as the VR bound in the reparameterized case, but this time relying on unbiased gradient estimators. We then provide two complementary theoretical analyses of the VR-IWAE bound and thus of the standard IWAE bound. These analyses shed light on the benefits, or lack thereof, of these bounds. Lastly, we illustrate our theoretical claims on toy and real-data examples.
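A small reparameterized estimator of the VR-IWAE bound, L_N^(alpha) = (1/(1-alpha)) E[log (1/N) sum_i w_i^(1-alpha)] with w_i = p(x, z_i)/q(z_i|x), on a toy Gaussian model (the model and names are assumptions for illustration; alpha -> 0 recovers the IWAE bound):

```python
import math
import torch

def vr_iwae_bound(log_p_joint, log_q, x, mu, log_sigma, N=16, alpha=0.5):
    # Reparameterized samples z_i = mu + sigma * eps_i, eps_i ~ N(0, 1),
    # so gradients w.r.t. (mu, log_sigma) are unbiased.
    eps = torch.randn(N, *mu.shape)
    z = mu + log_sigma.exp() * eps
    log_w = log_p_joint(x, z) - log_q(z, mu, log_sigma)          # shape (N,)
    # logsumexp for numerical stability of log (1/N) sum_i w_i^(1-alpha).
    lse = torch.logsumexp((1.0 - alpha) * log_w, dim=0) - math.log(N)
    return lse / (1.0 - alpha)

# Toy model: p(z) = N(0, 1), p(x|z) = N(z, 1), q(z|x) = N(mu, sigma).
Normal = torch.distributions.Normal

def log_p_joint(x, z):
    return Normal(0.0, 1.0).log_prob(z) + Normal(z, 1.0).log_prob(x)

def log_q(z, mu, log_sigma):
    return Normal(mu, log_sigma.exp()).log_prob(z)

x = torch.tensor(1.0)
mu = torch.tensor(0.5, requires_grad=True)
log_sigma = torch.tensor(0.0, requires_grad=True)
bound = vr_iwae_bound(log_p_joint, log_q, x, mu, log_sigma)
bound.backward()   # unbiased reparameterized gradients for mu, log_sigma
```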
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
- Asia > Middle East > Jordan (0.04)
- (2 more...)
Towards Practical Preferential Bayesian Optimization with Skew Gaussian Processes
Takeno, Shion, Nomura, Masahiro, Karasuyama, Masayuki
We study preferential Bayesian optimization (BO), where reliable feedback is limited to pairwise comparisons called duels. An important challenge in preferential BO, which uses the preferential Gaussian process (GP) model to represent flexible preference structures, is that the posterior distribution is a computationally intractable skew GP. The most widely used approach for preferential BO is the Gaussian approximation, which ignores the skewness of the true posterior. Alternatively, Markov chain Monte Carlo (MCMC)-based preferential BO has also been proposed. In this work, we first examine the accuracy of the Gaussian approximation and reveal the critical problem that the predictive probability of duels can be inaccurate. This observation motivates us to improve the MCMC-based estimation for skew GPs, for which we show the practical efficiency of Gibbs sampling and derive a low-variance Monte Carlo estimator. However, the computational time of MCMC can still be a bottleneck in practice. Towards more practical preferential BO, we develop a new method that achieves both high computational efficiency and low sample complexity, and demonstrate its effectiveness through extensive numerical experiments.
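To make the duel-probability quantity concrete, here is a hedged sketch: given posterior draws of the latent utilities (e.g. from Gibbs sampling), the predictive probability of a duel can be estimated by averaging the closed-form Gaussian CDF of the observation noise over the samples, a Rao-Blackwellized form in the spirit of a low-variance estimator. All names and the noise model are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.stats import norm

def duel_probability(f_samples_x, f_samples_xp, noise_std=0.1):
    # f_samples_* : posterior draws of f(x) and f(x') from the skew posterior.
    # Given the latents, P(x beats x') under i.i.d. Gaussian duel noise is
    # Phi((f(x) - f(x')) / (sqrt(2) * sigma)); averaging this closed form
    # over the draws has lower variance than thresholding noisy samples.
    margin = (f_samples_x - f_samples_xp) / (np.sqrt(2.0) * noise_std)
    return norm.cdf(margin).mean()

# Toy usage with stand-in posterior samples (skewed on purpose):
rng = np.random.default_rng(0)
fx = rng.gamma(2.0, 0.5, size=2000)     # hypothetical skewed posterior draws
fxp = rng.normal(1.0, 0.3, size=2000)
print(duel_probability(fx, fxp))
```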
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.51)
Massively Scaling Heteroscedastic Classifiers
Collier, Mark, Jenatton, Rodolphe, Mustafa, Basil, Houlsby, Neil, Berent, Jesse, Kokiopoulou, Effrosyni
Heteroscedastic classifiers, which learn a multivariate Gaussian distribution over prediction logits, have been shown to perform well on image classification problems with hundreds to thousands of classes. However, compared to standard classifiers, they introduce extra parameters that scale linearly with the number of classes, making them infeasible for larger-scale problems. In addition, heteroscedastic classifiers introduce a critical temperature hyperparameter that must be tuned. We propose HET-XL, a heteroscedastic classifier whose additional parameter count, relative to a standard classifier, is independent of the number of classes. In our large-scale settings, we show that the need to tune the temperature hyperparameter can be removed by learning it directly on the training data. On large image classification datasets with up to 4B images and 30k classes, our method requires 14x fewer additional parameters, does not require tuning the temperature on a held-out set, and consistently outperforms the baseline heteroscedastic classifier. HET-XL also improves ImageNet zero-shot classification in a multimodal contrastive learning setup, which can be viewed as a 3.5-billion-class classification problem.
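One way to read the parameter-count claim, as a hedged sketch rather than the released implementation: place the low-rank Gaussian noise in the d-dimensional embedding space before the shared classifier layer, so the extra parameters are O(d*r) regardless of the number of classes, and learn the temperature as an ordinary parameter.

```python
import torch
import torch.nn as nn

class HetHead(nn.Module):
    def __init__(self, d=512, num_classes=30000, rank=6, mc_samples=16):
        super().__init__()
        self.classifier = nn.Linear(d, num_classes)             # shared mean head
        self.low_rank = nn.Parameter(0.01 * torch.randn(d, rank))  # d x r, not K x r
        self.log_diag = nn.Parameter(torch.full((d,), -3.0))    # per-dim log std
        self.log_temp = nn.Parameter(torch.zeros(()))           # learned, not tuned
        self.mc_samples = mc_samples

    def forward(self, emb):                                     # emb: (B, d)
        B, d = emb.shape
        eps_lr = torch.randn(self.mc_samples, B, self.low_rank.shape[1])
        eps_d = torch.randn(self.mc_samples, B, d)
        # Low-rank-plus-diagonal Gaussian noise in embedding space.
        noise = eps_lr @ self.low_rank.T + self.log_diag.exp() * eps_d
        logits = self.classifier(emb + noise) / self.log_temp.exp()
        return logits.softmax(dim=-1).mean(dim=0)               # MC-averaged probs

head = HetHead()
probs = head(torch.randn(4, 512))                               # (4, 30000)
```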
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > Nova Scotia > Halifax Regional Municipality > Halifax (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France > Hauts-de-France > Nord > Lille (0.04)
- Research Report > Experimental Study (0.68)
- Research Report > New Finding (0.46)